Sen Hu

SMH-Bench: Benchmarking LLM Agents for Environment-Grounded Reasoning and Action in Smart Homes

Jun 01, 2026
Via arXiv

HomeFlow: A Data Flywheel for Smart Home Agent Training with Verifiable Simulation

May 31, 2026
Via arXiv

SkillGenBench: Benchmarking Skill Generation Pipelines for LLM Agents

May 18, 2026
Via arXiv

EpochX: Building the Infrastructure for an Emergent Agent Civilization

Mar 28, 2026
Via arXiv

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

Feb 06, 2026
Via arXiv

EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines

Jan 14, 2026
Via arXiv

Controlled Self-Evolution for Algorithmic Code Optimization

Jan 13, 2026
Via arXiv

MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences

Jan 13, 2026
Via arXiv

RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction

Jan 11, 2026
Via arXiv

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Jan 11, 2026
Via arXiv